DNA Identified Taxa List Dataset

Technical Description

Data collection and sampling were completed according to the Canadian Aquatic Biomonitoring Network (CABIN) Field Manual – Wadeable Streams (https://publications.gc.ca/site/eng/422979/publication.html) and the Procedure for collecting benthic macroinvertebrate DNA samples in wadeable streams (https://stream-dna.com/wp-content/uploads/2021/03/Sampling-procedure-for-DNA_STREAM-v1.1.pdf). The use of these protocols in sampling benthic macroinvertebrates for DNA identification is represented by ‘DNA pilot’ in the protocol field.

All benthic macroinvertebrate samples were processed and analyzed by the Hajibabaei Lab, Centre for Biodiversity Genomics, University of Guelph (http://hajibabaei.ibarcode.org/) using DNA metabarcoding technology to produce DNA sequences. These sequences were then processed through the MetaWorks pipeline to assign taxonomy (Porter and Hajibabaei, 2022).

The genus-level DNA identified taxa lists were formatted by ECCC to meet CABIN database requirements. This formatting included:
* reducing the taxa list to aquatic macroinvertebrates relevant to Canadian aquatic environments (e.g., removing terrestrial taxa), as per CABIN Laboratory Methods: Processing, Taxonomy and Quality Control of Benthic Macroinvertebrate samples, January 2020 (https://publications.gc.ca/site/eng/9.895039/publication.html);
* standardizing taxonomic names as per the Integrated Taxonomic Information System (https://www.itis.gov/); and
* meeting requirements for data formatting.

This dataset comprises three .csv data files: benthic invertebrate, study information and habitat data.
* The Benthic data.csv file consists of genus-level taxa lists in which a taxon that is present is indicated by the value 1 in the Count column. If a taxon is not listed, or its Count is 0, it was not detected in the sample through metabarcoding analysis. Other information includes the taxonomist and ITIS values.
Note that both the SubSample and Total Sample are 100 as the whole sample was processed for metabarcoding analysis.
* The Study data.csv file provides study information (including study purpose and description), sampling date, location, site description (including data authority) and alternative site codes.
* The Habitat data.csv file provides protocol and variable information, including variable type, description and value. Note: if no habitat information is shown, please refer to the Alternative Site Code field in the Study data.csv file for the CABIN study where this information is available.

Further information about these data files:
* Individual samples are identified by unique site codes, sampling date and, in the case of multiple samples on a given date, sample numbers (e.g., Sample Number 1, Sample Number 2, Sample Number 3).
* There are two types of sites in this dataset (reference and test). The status of reference and test sites is determined by the respective Data Authority. For more information about the criteria for the reference or test status of a given site, please contact the Data Authority (identified in the Site Description).
* Where paired morphological and DNA benthic macroinvertebrate samples were collected, the morphological taxonomic and habitat data are stored separately under the relevant CABIN study, which is cross-referenced in the Alternate Site Code field (the alternative study will be listed). Open Data for these paired samples can be found on the CABIN Open Government Portal (https://open.canada.ca/data/en/dataset/13564ca4-e330-40a5-9521-bfb1be767147).
* Data Authorities for each sample are identified in the respective Site Description field.

References

Porter, T. M., & Hajibabaei, M. (2022). MetaWorks: A flexible, scalable bioinformatic pipeline for high-throughput multi-marker biodiversity assessments. PLOS ONE, 17(9), e0274260. doi: 10.1371/journal.pone.0274260
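
Example of reading the Benthic data file

As an illustration of the Benthic data.csv layout described above, the following minimal Python sketch groups detected genera by sample, where a sample is identified by site code, sampling date and sample number. The column names (Site, SampleDate, SampleNumber, Genus, Count) and the example rows are assumptions made for illustration only and are not taken from the dataset; check them against the header of the actual file before use.

```python
import csv
import io
from collections import defaultdict

# Hypothetical column names; replace with the real header values.
SITE, DATE, SAMPLE_NO, GENUS, COUNT = (
    "Site", "SampleDate", "SampleNumber", "Genus", "Count"
)


def detected_genera(csv_file):
    """Map each (site, date, sample number) key to its set of detected genera."""
    detected = defaultdict(set)
    for row in csv.DictReader(csv_file):
        key = (row[SITE], row[DATE], row[SAMPLE_NO])
        genera = detected[key]  # registers the sample even with no detections
        # A Count of 1 marks a detection; 0 means the genus was not detected.
        if row[COUNT].strip() == "1":
            genera.add(row[GENUS])
    return dict(detected)


# Made-up rows used only to exercise the function.
EXAMPLE = """Site,SampleDate,SampleNumber,Genus,Count
ABC01,2021-09-14,1,Baetis,1
ABC01,2021-09-14,1,Simulium,1
ABC01,2021-09-14,2,Baetis,1
ABC01,2021-09-14,2,Simulium,0
"""

samples = detected_genera(io.StringIO(EXAMPLE))
for key, genera in sorted(samples.items()):
    print(key, sorted(genera))
```

To read a downloaded file instead of the inline example, the same function could be called on a file opened with open(path, newline="", encoding="utf-8"); the encoding of the published files is likewise an assumption to verify.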